Online multiclass learning with "bandit" feedback under a Passive-Aggressive approach

Authors

  • Hongliang Zhong
  • Emmanuel Daucé
  • Liva Ralaivola
Abstract

This paper presents a new approach to online multiclass learning with bandit feedback. The algorithm, named PAB (Passive-Aggressive in Bandit), is a variant of the Online Passive-Aggressive algorithm proposed in [2], which is an effective framework for max-margin online learning. We analyze some of its operating principles and show that it provides a good and scalable solution to the bandit classification problem, particularly on a real-world dataset, where it outperforms the best existing algorithms.
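
To make the setting concrete, the following is a minimal sketch of one round of online multiclass learning under bandit feedback, using the classic Passive-Aggressive step size (hinge loss divided by the squared norm of the input). It is not the PAB algorithm of the paper, whose update rule is not given in this abstract. The function name `pa_bandit_step`, the epsilon-greedy exploration, and the choice to update only the row of the predicted class are all assumptions made for illustration.

```python
import random


def pa_bandit_step(W, x, true_label, epsilon, rng):
    """One round: predict, observe a correct/incorrect bit, update one row.

    W is a list of per-class weight vectors (hypothetical layout).
    true_label is used only to simulate the environment's feedback bit.
    """
    k = len(W)
    scores = [sum(wi * xi for wi, xi in zip(row, x)) for row in W]
    greedy = max(range(k), key=lambda c: scores[c])
    # Epsilon-greedy exploration: an assumption of this sketch.
    predicted = rng.randrange(k) if rng.random() < epsilon else greedy
    # Bandit feedback: the learner sees only this bit, not the true label.
    correct = predicted == true_label
    sign = 1.0 if correct else -1.0
    # Binary hinge loss on the predicted class alone.
    loss = max(0.0, 1.0 - sign * scores[predicted])
    sq_norm = sum(xi * xi for xi in x)
    if loss > 0.0 and sq_norm > 0.0:
        tau = loss / sq_norm  # classic Passive-Aggressive step size
        W[predicted] = [wi + tau * sign * xi
                        for wi, xi in zip(W[predicted], x)]
    return predicted, correct
```

When the margin on the predicted class is already at least 1, the step leaves the weights unchanged (passive); otherwise it moves them just far enough to reach a margin of exactly 1 on the current example (aggressive).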

Similar resources

Boosting with Online Binary Learners for the Multiclass Bandit Problem

We consider the problem of online multiclass prediction in the bandit setting. Compared with the full-information setting, in which the learner can receive the true label as feedback after making each prediction, the bandit setting assumes that the learner can only know the correctness of the predicted label. Because the bandit setting is more restricted, it is difficult to design good bandit l...

New bounds on the price of bandit feedback for mistake-bounded online multiclass learning

This paper is about two generalizations of the mistake bound model to online multiclass classification. In the standard model, the learner receives the correct classification at the end of each round, and in the bandit model, the learner only finds out whether its prediction was correct or not. For a set F of multiclass classifiers, let opt_std(F) and opt_bandit(F) be the optimal bounds for lea...

Confusion-Based Online Learning and a Passive-Aggressive Scheme

This paper provides what is, to the best of our knowledge, the first analysis of online learning algorithms for multiclass problems when the confusion matrix is taken as a performance measure. The work builds upon recent and elegant results on noncommutative concentration inequalities, i.e. concentration inequalities that apply to matrices, and, more precisely, to matrix martingales. We do establish ge...

Efficient Online Bandit Multiclass Learning with Õ(√T) Regret

We present an efficient second-order algorithm with Õ((1/η)√T) regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by η, for a range of η restricted by the norm of the competitor. The family of loss functions ranges from hinge loss (η = 0) to squared hinge loss (η = 1). This provides a solution to the...

Publication date: 2015